Ward's Hierarchical Agglomerative Clustering Method: Which Algorithms Implement Ward's Criterion?

Authors

  • Fionn Murtagh
  • Pierre Legendre
Abstract

The Ward error sum of squares hierarchical clustering method has been very widely used since its first description by Ward in a 1963 publication. It has also been generalized in various ways. Two algorithms are found in the literature and software, both announcing that they implement the Ward clustering method. When applied to the same distance matrix, they produce different results. One algorithm preserves Ward’s criterion, the other does not. Our survey work and case studies will be useful for all those involved in developing software for data analysis using Ward’s hierarchical clustering method.
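
As a rough illustration of how two "Ward" implementations can disagree on the same input, the sketch below runs the standard Lance-Williams update for Ward's method twice: once on squared Euclidean distances and once on the unsquared distances. The abstract does not say how the two algorithms differ, so treating the difference as squared versus unsquared input is an assumption made here for illustration. The function names `ward_merges` and `ess` and the flag `square_input` are hypothetical and do not come from the paper or from any library.

```python
import math
from itertools import combinations


def ess(points, idx):
    """Error sum of squares of the cluster formed by the points at indices idx."""
    dim = len(points[0])
    centroid = [sum(points[i][c] for i in idx) / len(idx) for c in range(dim)]
    return sum((points[i][c] - centroid[c]) ** 2 for i in idx for c in range(dim))


def ward_merges(points, square_input):
    """Agglomerate with the Lance-Williams update for Ward's method.

    square_input=True  -> the update runs on squared Euclidean distances.
    square_input=False -> the same update runs on unsquared distances.
    Returns a list of (members_a, members_b, height), one entry per merge.
    """
    n = len(points)
    members = {i: [i] for i in range(n)}  # cluster id -> point indices
    d = {}                                # (id_a, id_b) with id_a < id_b -> dissimilarity
    for i, j in combinations(range(n), 2):
        dij = math.dist(points[i], points[j])
        d[(i, j)] = dij ** 2 if square_input else dij

    merges = []
    next_id = n
    while len(members) > 1:
        # Pick the closest pair of clusters under the current dissimilarities.
        (a, b), height = min(d.items(), key=lambda kv: kv[1])
        na, nb = len(members[a]), len(members[b])
        new = {}
        for k in members:
            if k in (a, b):
                continue
            nk = len(members[k])
            dka = d[(min(k, a), max(k, a))]
            dkb = d[(min(k, b), max(k, b))]
            # Lance-Williams coefficients for Ward's method.
            new[(k, next_id)] = (
                (na + nk) * dka + (nb + nk) * dkb - nk * height
            ) / (na + nb + nk)
        # Drop every entry involving a or b, then add the merged cluster's entries.
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        d.update(new)
        merges.append((sorted(members[a]), sorted(members[b]), height))
        members[next_id] = members.pop(a) + members.pop(b)
        next_id += 1
    return merges
```

With squared input, each merge height equals twice the increase in the error sum of squares caused by that merge, so the quantity Ward's criterion minimizes is tracked exactly. With unsquared input the same update yields heights that no longer correspond to that increase. For the one-dimensional points 0, 1, 3, for example, the second merge has height 25/3 with squared input (about 2.887 after taking the square root) and height 3 with unsquared input.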

Related resources

Interpreting and Extending Classical Agglomerative Clustering Algorithms using a Model-Based approach

We present two results which arise from a model-based approach to hierarchical agglomerative clustering. First, we show formally that the common heuristic agglomerative clustering algorithms – Ward’s method, single-link, complete-link, and a variant of group-average – are each equivalent to a hierarchical model-based method. This interpretation gives a theoretical explanation of the empirical b...

Evaluation of Agglomerative Hierarchical Clustering Methods

This paper describes the findings from evaluating the performance of agglomerative hierarchical cluster methods for determining seasonal factor groups. Seasonal factor groups are usually determined by traditional cluster analysis based on various similarity measures. Agglomerative hierarchical methods merge telemetry traffic monitoring sites (TTMSs) into groups according to their similarities. ...

Methods for detecting functional classifications in neuroimaging data.

Data-driven statistical methods are useful for examining the spatial organization of human brain function. Cluster analysis is one approach that aims to identify spatial classifications of temporal brain activity profiles. Numerous clustering algorithms are available, and no one method is optimal for all areas of application because an algorithm's performance depends on specific characteristics...

Generalising Ward’s Method for Use with Manhattan Distances

The claim that Ward's linkage algorithm in hierarchical clustering is limited to use with Euclidean distances is investigated. In this paper, Ward's clustering algorithm is generalised to use with l1 norm or Manhattan distances. We argue that the generalisation of Ward's linkage method to incorporate Manhattan distances is theoretically sound and provide an example of where this method outperfo...

Efficient Agglomerative clustering Method for Micro Array Data on Breast Cancer Outcome

Analysis of micro arrays presents a number of unique challenges for data mining. The main types of data analysis needed for biomedical applications include clustering: finding new biological classes or refining an existing one. We compare the various experimental clustering results of S+ from Insightful, XCluster at Stanford, Eisen’s Cluster, and Rousseau & Kaufman’s Web clusters for single linkag...

Journal:
  • J. Classification

Volume 31

Publication date: 2014